<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><title>frs_by_naics (DATA) data.table of NAICS code(s) for each EPA-regulated site in Facility Registry Service — frs_by_naics • EJAM</title><!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png"><link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png"><link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png"><link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png"><link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png"><link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png"><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../deps/font-awesome-6.4.2/css/all.min.css" rel="stylesheet"><link href="../deps/font-awesome-6.4.2/css/v4-shims.min.css" rel="stylesheet"><script src="../deps/headroom-0.11.0/headroom.min.js"></script><script src="../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../deps/search-1.0.0/fuse.min.js"></script><script src="../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="frs_by_naics 
(DATA) data.table of NAICS code(s) for each EPA-regulated site in Facility Registry Service — frs_by_naics"><meta name="description" content="This is the format with one row per site-NAICS pair,
so multiple rows for one site if it is in multiple NAICS.
This file is not stored in the package, but is obtained via dataload_from_pins().
The EPA also provides a FRS Facility Industrial Classification Search tool
where you can find facilities based on NAICS or SIC.
MOST SITES LACK NAICS INFO IN FRS! NAICS is missing for about 80 percent of these facilities.
frs here had about 2.5 million unique REGISTRY_ID values, but
frs_by_naics had only about 700k rows
about 562,000 unique REGISTRY_ID values with
about 2,900 unique NAICS codes.
length(unique(frs_by_naics$REGISTRY_ID))
length(unique(frs_by_naics[,REGISTRY_ID]))
length(frs_by_naics[, unique(REGISTRY_ID)])
frs_by_naics[,uniqueN(REGISTRY_ID)]
   e.g., 573,411 in mid 2024

    lat       lon  REGISTRY_ID  NAICS

1: 34.04722 -81.15136 110000854246 325211
2: 34.04722 -81.15136 110000854246 325220
3: 34.04722 -81.15136 110000854246 325222"><meta property="og:description" content="This is the format with one row per site-NAICS pair,
so multiple rows for one site if it is in multiple NAICS.
This file is not stored in the package, but is obtained via dataload_from_pins().
The EPA also provides a FRS Facility Industrial Classification Search tool
where you can find facilities based on NAICS or SIC.
MOST SITES LACK NAICS INFO IN FRS! NAICS is missing for about 80 percent of these facilities.
frs here had about 2.5 million unique REGISTRY_ID values, but
frs_by_naics had only about 700k rows
about 562,000 unique REGISTRY_ID values with
about 2,900 unique NAICS codes.
length(unique(frs_by_naics$REGISTRY_ID))
length(unique(frs_by_naics[,REGISTRY_ID]))
length(frs_by_naics[, unique(REGISTRY_ID)])
frs_by_naics[,uniqueN(REGISTRY_ID)]
   e.g., 573,411 in mid 2024

    lat       lon  REGISTRY_ID  NAICS

1: 34.04722 -81.15136 110000854246 325211
2: 34.04722 -81.15136 110000854246 325220
3: 34.04722 -81.15136 110000854246 325222"><meta property="og:image" content="https://usepa.github.io/EJAM/logo.svg"></head><body>
    <a href="#main" class="visually-hidden-focusable">Skip to contents</a>


    <nav class="navbar navbar-expand-lg fixed-top bg-light" data-bs-theme="light" aria-label="Site navigation"><div class="container">

    <a class="navbar-brand me-2" href="../index.html">EJAM</a>

    <small class="nav-text text-warning me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="Released version">2.32.0</small>


    <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
      <span class="navbar-toggler-icon"></span>
    </button>

    <div id="navbar" class="collapse navbar-collapse ms-3">
      <ul class="navbar-nav me-auto"><li class="active nav-item"><a class="nav-link" href="../reference/index.html">Reference</a></li>
<li class="nav-item dropdown">
  <button class="nav-link dropdown-toggle" type="button" id="dropdown-articles" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true">Articles</button>
  <ul class="dropdown-menu" aria-labelledby="dropdown-articles"><li><hr class="dropdown-divider"></li>
    <li><h6 class="dropdown-header" data-toc-skip>Overview for EJAM Users</h6></li>
    <li><a class="dropdown-item" href="../articles/0_whatis.html">What is EJAM</a></li>
    <li><a class="dropdown-item" href="../articles/0_webapp.html">Using EJAM</a></li>
    <li><hr class="dropdown-divider"></li>
    <li><h6 class="dropdown-header" data-toc-skip>For analysts using R</h6></li>
    <li><a class="dropdown-item" href="../articles/1_installing.html">Installing the EJAM R package</a></li>
    <li><a class="dropdown-item" href="../articles/2_quickstart.html">Quick Start Guide</a></li>
    <li><a class="dropdown-item" href="../articles/3_analyzing.html">Basics of Using EJAM for Analysis in RStudio</a></li>
    <li><a class="dropdown-item" href="../articles/4_advanced.html">Advanced Features</a></li>
  </ul></li>
<li class="nav-item"><a class="nav-link" href="../news/index.html">Changelog</a></li>
      </ul><ul class="navbar-nav"><li class="nav-item"><form class="form-inline" role="search">
 <input class="form-control" type="search" name="search-input" id="search-input" autocomplete="off" aria-label="Search site" placeholder="Search for" data-search-index="../search.json"></form></li>
<li class="nav-item"><a class="external-link nav-link" href="https://github.com/USEPA/EJAM/" aria-label="GitHub"><span class="fa fab fa-github fa-lg"></span></a></li>
      </ul></div>


  </div>
</nav><div class="container template-reference-topic">
<div class="row">
  <main id="main" class="col-md-9"><div class="page-header">
      <img src="../logo.svg" class="logo" alt=""><h1>frs_by_naics (DATA) data.table of NAICS code(s) for each EPA-regulated site in Facility Registry Service</h1>
      <small class="dont-index">Source: <a href="https://github.com/USEPA/EJAM/blob/HEAD/R/data_frs_by_naics.R" class="external-link"><code>R/data_frs_by_naics.R</code></a></small>
      <div class="d-none name"><code>frs_by_naics.Rd</code></div>
    </div>

    <div class="ref-description section level2">
    <p>This is the format with one row per site-NAICS pair,
so a site that falls under multiple NAICS codes appears in multiple rows.</p>
<p>This file is not stored in the package, but is obtained via <code><a href="dataload_from_pins.html">dataload_from_pins()</a></code>.</p>
<p>The EPA also provides a <a href="https://www.epa.gov/frs/frs-query#industrial" class="external-link">FRS Facility Industrial Classification Search tool</a>
where you can find facilities based on NAICS or SIC.</p>
<p>Most sites lack NAICS information in FRS: NAICS is missing for about 80 percent of these facilities.</p>
<p>The <code>frs</code> table had about 2.5 million unique REGISTRY_ID values, but
<code>frs_by_naics</code> had only about 700k rows, covering
about 562,000 unique REGISTRY_ID values and
about 2,900 unique NAICS codes.</p>
<p>Each of the following returns the number of unique REGISTRY_ID values (e.g., 573,411 in mid 2024):</p>
<p><code>length(unique(frs_by_naics$REGISTRY_ID))</code></p>
<p><code>length(unique(frs_by_naics[,REGISTRY_ID]))</code></p>
<p><code>length(frs_by_naics[, unique(REGISTRY_ID)])</code></p>
<p><code>frs_by_naics[,uniqueN(REGISTRY_ID)]</code></p>
<p>Example rows:</p>
<div class="sourceCode"><pre><code><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a>        lat       lon  REGISTRY_ID  NAICS</span>
<span id="cb1-2"><a href="#cb1-2" tabindex="-1"></a>1: 34.04722 -81.15136 110000854246 325211</span>
<span id="cb1-3"><a href="#cb1-3" tabindex="-1"></a>2: 34.04722 -81.15136 110000854246 325220</span>
<span id="cb1-4"><a href="#cb1-4" tabindex="-1"></a>3: 34.04722 -81.15136 110000854246 325222</span></code></pre></div>
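<p>As a rough illustration of the one-row-per-pair layout, the sketch below builds a small stand-in table from the three example rows plus one made-up site, then counts pairs, sites, and codes. The object name <code>toy</code>, the second site's values, and the use of character columns for REGISTRY_ID and NAICS are assumptions for illustration only; the real table may use different column types.</p>
<div class="sourceCode"><pre class="sourceCode r"><code>library(data.table)

# Toy stand-in for frs_by_naics: the three example rows plus one
# hypothetical site (its ID, coordinates, and NAICS code are made up).
toy &lt;- data.table(
  lat         = c(34.04722, 34.04722, 34.04722, 40.00000),
  lon         = c(-81.15136, -81.15136, -81.15136, -75.00000),
  REGISTRY_ID = c("110000854246", "110000854246", "110000854246", "999999999999"),
  NAICS       = c("325211", "325220", "325222", "325211")
)

nrow(toy)                     # number of site-NAICS pairs (rows)
toy[, uniqueN(REGISTRY_ID)]   # number of distinct sites
toy[, uniqueN(NAICS)]         # number of distinct NAICS codes

# Number of NAICS codes recorded for each site
toy[, .(n_naics = .N), by = REGISTRY_ID]</code></pre></div>
<p>Because a site is repeated once per NAICS code, the row count is an upper bound on the number of sites, which is why the row count and the unique REGISTRY_ID count differ.</p>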
    </div>


    <div class="section level2">
    <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"></a></h2>
    <div class="dont-index"><p><a href="frs.html">frs</a>, <code><a href="frs_from_naics.html">frs_from_naics()</a></code>, <code><a href="naics_categories.html">naics_categories()</a></code>, <a href="frs_by_programid.html">frs_by_programid</a>, and <code>naics_from_any</code> in the EJAM package.</p></div>
    </div>

    <div class="section level2">
    <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-examples"></a></h2>
    <div class="sourceCode"><pre class="sourceCode r"><code><span> <span class="co"># NAICS is missing for about 80 percent of facilities</span></span>
<span> <span class="va">frs[ NAICS == "", .N] / frs[,.N]</span></span>
<span> <span class="co"># only about 562k facilities have some NAICS info</span></span>
<span> <span class="va">frs[ NAICS != "", .N]</span></span>
<span> <span class="va">frs_by_naics[, uniqueN(REGISTRY_ID)]</span> <span class="co"># almost exactly matches the above</span></span>
<span> </span>
<span> <span class="fu"><a href="https://rdrr.io/r/base/dim.html" class="external-link">dim</a></span><span class="op">(</span><span class="va">frs_by_naics</span><span class="op">)</span> </span>
<span> <span class="co"># about 680k rows here, each one a pair of 1 NAICS code and 1 registry ID,</span></span>
<span> <span class="co">#  since some IDs have 2 or more NAICS so appear as 2 or more rows here.</span></span>
<span> </span>
<span> <span class="co"># About 2,900 different NAICS codes appear here:</span></span>
<span> <span class="va">frs_by_naics[,  uniqueN(NAICS)]</span></span>
<span> <span class="va">frs_by_naics[, .(sum(.N &gt; 1)), by=NAICS][,sum(V1)]</span></span>
<span>   <span class="co">#  2,457 NAICS codes are used to describe more than one Registry ID</span></span>
<span>  <span class="va">frs_by_naics[, .(sum(.N == 1)), by=NAICS][,sum(V1)]</span></span>
<span>   <span class="co"># [1] 425 NAICS codes appear only once, i.e., apply to only a single facility!</span></span>
<span>   </span>
<span> <span class="co"># Which 2-digit NAICS are found here most often?</span></span>
<span> <span class="va">frs_by_naics[ , .N, keyby=substr(NAICS,1,2)]</span></span>
<span> <span class="va">frs_by_naics[ , .N,   by=substr(NAICS,1,2)][order(N),]</span> <span class="co"># Most common is 33 (last row, since sorted ascending)</span></span>
<span> <span class="co"># Top 10 most common 3-digit NAICS here:</span></span>
<span> <span class="va">x = tail(frs_by_naics[ , .N,   by=.(n3 = substr(NAICS,1,3))][order(N), ],10)</span></span>
<span> <span class="va">cbind(x, industry = rownames(naics_categories(3))[match(x$n3, naics_categories(3))])</span></span></code></pre></div>
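    <p>The same kinds of queries can be tried without downloading the full dataset. The sketch below uses a small made-up table (the name <code>toy</code>, the site IDs "A", "B", "C", and the character NAICS column are all assumptions for illustration) to show three common lookups on a one-row-per-pair table.</p>
    <div class="sourceCode"><pre class="sourceCode r"><code>library(data.table)

# Hypothetical stand-in for frs_by_naics; IDs and codes are illustrative only.
toy &lt;- data.table(
  REGISTRY_ID = c("A", "A", "A", "B", "C"),
  NAICS       = c("325211", "325220", "325222", "325211", "562212")
)

# Sites listed under any NAICS code that starts with "325"
sites_325 &lt;- toy[startsWith(NAICS, "325"), unique(REGISTRY_ID)]

# Sites that appear under more than one NAICS code
multi_naics_sites &lt;- toy[, .N, by = REGISTRY_ID][N &gt; 1, REGISTRY_ID]

# NAICS codes that apply to only a single site
single_site_codes &lt;- toy[, .(n_sites = uniqueN(REGISTRY_ID)), by = NAICS][n_sites == 1, NAICS]

sites_325
multi_naics_sites
single_site_codes</code></pre></div>
    <p>Counting with <code>uniqueN(REGISTRY_ID)</code> per NAICS code, rather than <code>.N</code>, guards against double counting if a site-NAICS pair were ever duplicated.</p>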
    </div>
  </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
    </nav></aside></div>


    <footer><div class="pkgdown-footer-left">
  <p>US EPA 2024</p>
</div>

<div class="pkgdown-footer-right">
  <p>EJAM Version 2.32.0</p>
</div>

    </footer></div>





  </body></html>

